Hardness of Bichromatic Closest Pair with Jaccard Similarity
Consider collections $\mathcal{A}$ and $\mathcal{B}$ of red and blue sets,
respectively. Bichromatic Closest Pair is the problem of finding a pair from
$\mathcal{A} \times \mathcal{B}$ that has similarity higher than a given
threshold according to some similarity measure. Our focus here is the classic
Jaccard similarity $|a \cap b| / |a \cup b|$
for $(a, b) \in \mathcal{A} \times \mathcal{B}$.
We consider the approximate version of the problem where we are given
thresholds $j_1 > j_2$ and wish to return a pair from $\mathcal{A} \times \mathcal{B}$ that has Jaccard similarity higher than $j_2$ if there exists a
pair in $\mathcal{A} \times \mathcal{B}$ with Jaccard similarity at least $j_1$.
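As a point of reference for the problem statement above, the following is a minimal brute-force sketch in Python, not the method of the paper. The function names, the parameter name `j2` for the lower threshold, and the convention that two empty sets have similarity 1 are all assumptions made for illustration. It checks every red-blue pair, so it takes quadratic time in the number of sets, which is the baseline that the algorithmic and hardness results are measured against.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard similarity |a ∩ b| / |a ∪ b| of two finite sets."""
    union = a | b
    if not union:
        return 1.0  # assumed convention for two empty sets
    return len(a & b) / len(union)


def approx_bichromatic_closest_pair(red, blue, j2):
    """Return some (a, b) from red x blue with similarity > j2, else None.

    If any pair has similarity at least j1 for some j1 > j2, a pair is
    always returned, so this solves the approximate problem by exhaustion.
    """
    for a, b in product(red, blue):
        if jaccard(a, b) > j2:
            return a, b
    return None
```

For example, with `red = [frozenset({1, 2, 3})]` and `blue = [frozenset({2, 3, 4})]` the only pair has similarity 2/4, so it is returned for `j2 = 0.4` and not for `j2 = 0.5`.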
The classic locality sensitive hashing (LSH) algorithm of Indyk and Motwani
(STOC '98), instantiated with the MinHash LSH function of Broder et al., solves
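this problem in $\tilde{O}(n^{2-\delta})$ time if $j_1 \ge j_2^{1-\delta}$. In
particular, for $\delta = \Omega(1)$, the approximation ratio $j_1/j_2 = 1/j_2^{\delta}$
increases polynomially in $1/j_2$.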
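The trade-off between running time and approximation ratio can be made concrete with a small calculation. The sketch below assumes the standard LSH analysis, in which a MinHash collision occurs with probability equal to the Jaccard similarity, so that for thresholds $j_1 > j_2$ the exponent is $\rho = \ln(1/j_1) / \ln(1/j_2)$ and finding a pair among $n$ sets costs roughly $n^{1+\rho}$ up to lower-order factors. The function names are hypothetical and not taken from the paper.

```python
import math


def minhash_lsh_exponent(j1, j2):
    """LSH exponent rho = ln(1/j1) / ln(1/j2) for thresholds 0 < j2 < j1 < 1."""
    if not 0 < j2 < j1 < 1:
        raise ValueError("need 0 < j2 < j1 < 1")
    return math.log(1 / j1) / math.log(1 / j2)


def subquadratic_saving(j1, j2):
    """Largest delta with 1 + rho <= 2 - delta, that is delta = 1 - rho."""
    return 1 - minhash_lsh_exponent(j1, j2)
```

With `j1 = 0.5` and `j2 = 0.25` the exponent is 1/2, so the saving is 1/2 as well; this matches the boundary case where `j1` equals `j2 ** (1 - delta)`.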
In this paper we give a corresponding hardness result. Assuming the
Orthogonal Vectors Conjecture (OVC), we show that there cannot be a general
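solution that solves the Bichromatic Closest Pair problem in $O(n^{2-\Omega(1)})$
time for $j_1/j_2 = 1/j_2^{o(1)}$. Specifically, assuming
OVC, we prove that for any $\delta > 0$ there exists an $\varepsilon > 0$ such that
Bichromatic Closest Pair with Jaccard similarity requires time $\Omega(n^{2-\delta})$
for any choice of thresholds $j_2 < j_1 < 1 - \delta$ that
satisfy $j_1 \le j_2^{1-\varepsilon}$.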